
18 research outputs found

Unordered feature tracking made fast and easy

Author: Monasse Pascal
Moulon Pierre
Publication venue: HAL CCSD
Publication date: 01/01/2012

We present an efficient algorithm to fuse two-view correspondences into multi-view consistent tracks. The proposed method relies on the Union-Find algorithm to solve the fusion problem. It is very simple and has a lower computational complexity than other available methods. Our experiments show that it is faster and computes more tracks.
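As an illustration of the idea in this abstract, here is a minimal sketch of fusing two-view matches into tracks with Union-Find. The function name `fuse_tracks`, the `(image, feature)` node encoding, and the rule that discards tracks containing two features from the same image are assumptions made for this example, not the authors' implementation.

```python
from collections import defaultdict


def fuse_tracks(pairwise_matches):
    """Fuse two-view matches into multi-view tracks with Union-Find.

    pairwise_matches: iterable of ((image_a, feature_a), (image_b, feature_b)).
    Returns a sorted list of tracks, each a sorted list of (image, feature) nodes.
    """
    parent = {}
    size = {}

    def find(node):
        # Register unseen nodes as singleton sets.
        if node not in parent:
            parent[node] = node
            size[node] = 1
        root = node
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every visited node directly at the root.
        while parent[node] != root:
            parent[node], node = root, parent[node]
        return root

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        # Union by size: attach the smaller tree under the larger one.
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra
        size[ra] += size[rb]

    for a, b in pairwise_matches:
        union(a, b)

    groups = defaultdict(list)
    for node in parent:
        groups[find(node)].append(node)

    tracks = []
    for nodes in groups.values():
        images = [image for image, _ in nodes]
        # Assumed consistency rule: a track may see each image at most once.
        if len(nodes) >= 2 and len(images) == len(set(images)):
            tracks.append(sorted(nodes))
    return sorted(tracks)


# Hypothetical matches: two consistent chains and one conflicting chain
# (the last chain reaches image 0 through features 3 and 8).
matches = [
    ((0, 1), (1, 4)), ((1, 4), (2, 7)),
    ((0, 2), (1, 5)),
    ((0, 3), (1, 6)), ((1, 6), (2, 9)), ((2, 9), (0, 8)),
]
tracks = fuse_tracks(matches)
```

Each match costs one `union`, so the whole fusion runs in near-linear time in the number of matches.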

CiteSeerX

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Python Photogrammetry Toolbox: A free solution for Three-Dimensional Documentation

Author: Bezzi Alessandro
Moulon Pierre
Publication venue: HAL CCSD
Publication date: 09/06/2011

The modern techniques of Structure from Motion (SfM) and Image-Based Modelling (IBM) open new perspectives in the field of archaeological documentation, providing a simple and accurate way to record three-dimensional data. In the last edition of the workshop, the presentation "Computer Vision and Structure From Motion, new methodologies in archaeological three-dimensional documentation. An open source approach." showed the advantages of this new methodology (low cost, portability, versatility, ...), but it also identified some problems: the use of the closed-source SIFT feature detector and the need to simplify the workflow. The software Python Photogrammetry Toolbox (PPT) is a possible solution to these problems. It is composed of Python scripts that automate the different steps of the workflow. The entire process is reduced to two commands, calibration and dense reconstruction. The user can run it from a graphical interface or from a terminal.

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Robust and accurate positioning of image networks (Positionnement robuste et précis de réseaux d’images)

Author: Moulon Pierre
Publication venue: HAL CCSD
Publication date: 10/01/2014

Computing a 3D representation of a rigid scene from a collection of pictures is now possible thanks to the progress made by multiple-view stereovision methods, even with a simple camera. The reconstruction process, arising from photogrammetry, consists in integrating information from multiple images taken from different viewpoints in order to identify the relative position and orientation of each picture. Once the positions and orientations of the cameras (external calibration) are retrieved, the structure of the scene can be reconstructed. To solve the Structure from Motion (SfM) problem, sequential and global methods have been proposed. By nature, sequential methods tend to accumulate errors. This shows up as camera trajectories that drift and, when pictures are acquired around an object, as reconstructions where the loops do not close. In contrast, global methods consider the network of cameras as a whole. The configuration of cameras is searched and optimized so as to best preserve the cycle constraints of the network. Reconstructions of better quality can be obtained, but at the expense of computation time.
This thesis analyzes critical issues at the heart of these external calibration methods and provides solutions to improve their performance (accuracy, robustness and speed) and their ease of use (restricted parametrization). We first propose a fast and efficient feature tracking algorithm. We then show that the widespread use of a contrario robust estimation of parametric models frees the user from choosing detection thresholds and yields a reconstruction pipeline that automatically adapts to the data. In a second step, we use this adaptive robust estimation and a series of convex optimizations to build a scalable global calibration chain. Our experiments show that a contrario based estimation significantly improves the quality of the estimated picture positions and orientations, while being automatic and parameter-free, even on complex camera networks. Finally, we propose to improve the visual appearance of the reconstruction through a convex optimization that ensures color consistency between images.

Thèses en Ligne

thèses en ligne de ParisTech

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

A contrario robust model estimation: impact on accuracy in structure from motion (Estimation robuste de modèle a contrario, impact sur la précision en structure from motion)

Author: Marlet Renaud
Monasse Pascal
Moulon Pierre
Publication venue: HAL CCSD
Publication date: 07/02/2013

Model estimation consists in identifying a model among noisy data. This problem is not trivial, and the state of the art offers many solutions to it. Most often, max-consensus or RANSAC solutions are used. These methods search for several candidate solutions by random draws and keep the one with the largest cardinality for a precision given a priori. Despite their simplicity, they have a major drawback: a data acceptance threshold T must be specified. The question then arises of how to choose this parameter. Choosing too large a threshold over-estimates the set of valid data, so noisy data are introduced into the model, whereas choosing too small a threshold under-estimates it and yields an imprecise model. We discuss the AC-RANSAC solution for Structure from Motion and its impact on the accuracy of the estimated camera positions.
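To make the role of the threshold T concrete, the following is a minimal sketch of classical fixed-threshold RANSAC (not AC-RANSAC) on a toy 2D line-fitting problem. The functions `fit_line` and `ransac_line`, the synthetic points, and the two threshold values are all assumptions chosen for illustration.

```python
import math
import random


def fit_line(p, q):
    """Line through p and q as (a, b, c), with a*x + b*y + c = 0 and a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = math.hypot(a, b)
    a, b = a / norm, b / norm
    return a, b, -(a * x1 + b * y1)


def ransac_line(points, threshold, iterations=200, seed=0):
    """Keep the sampled line with the most points within `threshold` of it."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(iterations):
        # Minimal sample: two distinct points define a line.
        p, q = rng.sample(points, 2)
        a, b, c = fit_line(p, q)
        inliers = [pt for pt in points
                   if abs(a * pt[0] + b * pt[1] + c) <= threshold]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (a, b, c), inliers
    return best_model, best_inliers


# Toy data: ten points on y = 2x + 1 plus four gross outliers.
inlier_points = [(float(i), 2.0 * i + 1.0) for i in range(10)]
outlier_points = [(1.0, 10.0), (3.0, -4.0), (6.0, 20.0), (8.0, 2.0)]
points = inlier_points + outlier_points

# A well-chosen threshold isolates the ten points on the line ...
model, inliers = ransac_line(points, threshold=0.5)
# ... while a threshold that is too large accepts the outliers as well.
loose_model, loose_inliers = ransac_line(points, threshold=8.0)
```

The second call shows the over-estimation case described in the abstract: with an overly large T, every point is counted as valid.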

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Chat2Map: Efficient Scene Mapping from Multi-Ego Conversations

Author: Calamia Paul
Grauman Kristen
Henderson Ethan
Ithapu Vamsi Krishna
Jiang Hao
Majumder Sagnik
Moulon Pierre
Publication date: 20/04/2023

Can conversational videos captured from multiple egocentric viewpoints reveal the map of a scene in a cost-efficient way? We seek to answer this question by proposing a new problem: efficiently building the map of a previously unseen 3D environment by exploiting shared information in the egocentric audio-visual observations of participants in a natural conversation. Our hypothesis is that as multiple people ("egos") move in a scene and talk among themselves, they receive rich audio-visual cues that can help uncover the unseen areas of the scene. Given the high cost of continuously processing egocentric visual streams, we further explore how to actively coordinate the sampling of visual information, so as to minimize redundancy and reduce power use. To that end, we present an audio-visual deep reinforcement learning approach that works with our shared scene mapper to selectively turn on the camera to efficiently chart out the space. We evaluate the approach using a state-of-the-art audio-visual simulator for 3D scenes as well as real-world video. Our model outperforms previous state-of-the-art mapping methods, and achieves an excellent cost-accuracy tradeoff. Project: http://vision.cs.utexas.edu/projects/chat2map. Comment: Accepted to CVPR 202

arXiv.org e-Print Archive

Robust and accurate calibration of camera networks

Author: Moulon Pierre
Publication date: 10/01/2014

Computing a 3D representation of a rigid scene from a collection of pictures is now possible thanks to the progress made by multiple-view stereovision methods, even with a simple camera. The reconstruction process, arising from photogrammetry, consists in integrating information from multiple images taken from different viewpoints in order to identify the relative position and orientation of each picture. Once the positions and orientations of the cameras (external calibration) are retrieved, the structure of the scene can be reconstructed. To solve the Structure from Motion (SfM) problem, sequential and global methods have been proposed. By nature, sequential methods tend to accumulate errors. This shows up as camera trajectories that drift and, when pictures are acquired around an object, as reconstructions where the loops do not close. In contrast, global methods consider the network of cameras as a whole. The configuration of cameras is searched and optimized so as to best preserve the cycle constraints of the network. Reconstructions of better quality can be obtained, but at the expense of computation time.
This thesis analyzes critical issues at the heart of these external calibration methods and provides solutions to improve their performance (accuracy, robustness and speed) and their ease of use (restricted parametrization). We first propose a fast and efficient feature tracking algorithm. We then show that the widespread use of a contrario robust estimation of parametric models frees the user from choosing detection thresholds and yields a reconstruction pipeline that automatically adapts to the data. In a second step, we use this adaptive robust estimation and a series of convex optimizations to build a scalable global calibration chain. Our experiments show that a contrario based estimation significantly improves the quality of the estimated picture positions and orientations, while being automatic and parameter-free, even on complex camera networks. Finally, we propose to improve the visual appearance of the reconstruction through a convex optimization that ensures color consistency between images.

Theses.fr

Automatic Homographic Registration of a Pair of Images, with A Contrario Elimination of Outliers

Author: Moisan Lionel
Monasse Pascal
Moulon Pierre
Publication venue: Image Processing On Line
Publication date: 19/05/2012

The RANSAC algorithm (RANdom SAmple Consensus) is a robust method to estimate the parameters of a model fitting the data in the presence of outliers. Its random nature is due only to complexity considerations. It iteratively extracts from the data a random sample of the minimal size sufficient to estimate the parameters. At each such trial, the number of inliers (data that fit the model within an acceptable error threshold) is counted. In the end, the set of parameters maximizing the number of inliers is accepted. The variant proposed by Moisan and Stival introduces an a contrario criterion to avoid hard thresholds for inlier/outlier discrimination. It has three consequences. First, the threshold for inlier/outlier discrimination is adaptive and does not need to be fixed. Second, it gives a decision on the adequacy of the final model: it does not return a wrong set of parameters when it lacks confidence. Third, the procedure to draw a new sample can be amended as soon as one set of parameters is deemed meaningful: the new sample can be drawn among the inliers of this model. In this particular instantiation, we apply it to the estimation of the homography registering two images of the same scene. The homography is an 8-parameter model arising in two situations when using a pinhole camera: the scene is planar (a painting, a facade, etc.) or the viewpoint location is fixed (pure rotation around the optical center). When the homography is found, it is used to stitch the images in the coordinate frame of the second image and build a panorama. The point correspondences between images are computed by the SIFT algorithm.
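The "8-parameter" count follows from the homography being a 3x3 matrix defined only up to scale: nine entries minus one free scale factor. The sketch below illustrates this with a hypothetical matrix `H`; the helper names `apply_homography` and `normalize` and all numeric values are assumptions for the example, not taken from the paper.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point is mapped to infinity")
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return (u / w, v / w)


def normalize(H):
    """Fix the free scale by dividing by H[2][2] (assumes H[2][2] != 0)."""
    s = H[2][2]
    return [[h / s for h in row] for row in H]


# Hypothetical homography and the same matrix multiplied by 4.
H = [[1.0, 0.2, 5.0],
     [0.1, 0.9, -3.0],
     [0.001, 0.002, 1.0]]
H_scaled = [[4.0 * h for h in row] for row in H]

p = apply_homography(H, (10.0, 20.0))
p_scaled = apply_homography(H_scaled, (10.0, 20.0))
```

Both matrices map the point to the same location, so after `normalize` only eight entries remain free.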

Directory of Open Access Journals

HAL Descartes

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

Global Fusion of Relative Motions for Robust, Accurate and Scalable Structure from Motion

Author: Marlet Renaud
Monasse Pascal
Moulon Pierre
Publication venue: HAL CCSD
Publication date: 01/01/2013

Multi-view structure from motion (SfM) estimates the position and orientation of pictures in a common 3D coordinate frame. When views are treated incrementally, this external calibration can be subject to drift, contrary to global methods that distribute residual errors evenly. We propose a new global calibration approach based on the fusion of relative motions between image pairs. We improve an existing method for robustly computing global rotations. We present an efficient a contrario trifocal tensor estimation method, from which stable and precise translation directions can be extracted. We also define an efficient translation registration method that recovers accurate camera positions. These components are combined into an original SfM pipeline. Our experiments show that, on most datasets, it outperforms in accuracy other existing incremental and global pipelines. It also achieves strikingly good running times: it is about 20 times faster than the other global method we could compare to, and as fast as the best incremental method. More importantly, it features better scalability properties.
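The abstract does not spell out how relative motions are checked or fused. As a hedged illustration of why relative rotations constrain one another, the sketch below composes the three relative rotations of a camera triplet and measures how far the product is from the identity. The convention that `R_ij` maps camera i's frame to camera j's frame, the helper names, and the angles are assumptions for this example.

```python
import math


def rot_z(deg):
    """Rotation matrix about the z axis by `deg` degrees."""
    t = math.radians(deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_angle_deg(R):
    """Angle of a rotation matrix, from trace(R) = 1 + 2*cos(angle)."""
    cos_angle = (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0
    # Clamp to guard against round-off pushing the value outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def cycle_error_deg(R_ij, R_jk, R_ki):
    """Residual angle of the loop i -> j -> k -> i (0 for consistent rotations)."""
    return rotation_angle_deg(matmul(R_ki, matmul(R_jk, R_ij)))


# Consistent triplet: 30 + 50 - 80 degrees closes the loop.
consistent = cycle_error_deg(rot_z(30.0), rot_z(50.0), rot_z(-80.0))
# Same triplet with a 3-degree error on one relative rotation.
drifted = cycle_error_deg(rot_z(30.0), rot_z(53.0), rot_z(-80.0))
```

A nonzero residual means the loop does not close, which is the kind of error an incremental chain can accumulate and a global method can spread over the whole network.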

CiteSeerX

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM